Agentic SDLC

Why it was built this way

After 5 years in enterprise IT operations, I watched the same failure pattern repeat across every support queue I managed. Bugs were rarely caused by bad code in isolation. They came from handoff failures: a planner who did not communicate constraints to the implementer, a reviewer who missed the edge case the tester would have caught, documentation that was never updated after the fix.

When I started learning AI engineering in 2024, the obvious first question was: can agents fix this? Not code autocomplete. Not a chatbot that writes functions. A full autonomous pipeline where a plain-English spec flows through planning, implementation, review, testing, documentation, and commit without a human touching the keyboard between stages.

That is what the AI-Powered SDLC Engine is. Node.js, six agents, SQLite state, nine-provider failover. It runs today.

How it works: step by step

Step 1: Researcher maps the target repository using an AST-based code graph. It ingests the codebase with Moonshot's 128K-window model (moonshot-v1-128k), identifies which files are relevant to the goal, and writes a structured context map to the Blackboard.
Step 2: Perceptor reads the Researcher's output and triages the goal: Is it a bug fix, a new feature, a refactor? Scores complexity and sets priority flags that constrain what the Architect is allowed to plan.
Step 3: Architect produces a step-by-step implementation plan: which files to touch, what functions to add or modify, what edge cases to cover. Writes this as structured JSON to the Blackboard that the Coder reads directly.
Step 4: Coder executes the plan. Uses kimi-k2.6 as primary (high-fidelity agentic code generation), failing over to Bedrock Qwen3-Coder, then the rest of the nine-provider chain.
Step 5: Auditor (up to 3 iterations) reviews the Coder's output against the original plan and DNA contracts. If it finds issues, it sends the Coder back with specific instructions. The loop exits with a verdict after a clean pass or after the third failed iteration.
Step 6: Documenter + auto-commit reads the final diff, updates the relevant docs and README sections, then the engine commits everything atomically with a structured commit message.
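The Coder/Auditor loop in Step 5 can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: runAuditLoop, the coder and auditor callbacks, and the shape of the review object are all hypothetical, and it assumes the loop exits on the first clean review.

```javascript
// Hypothetical sketch of the Step 5 loop. `coder` and `auditor` are stand-in
// async functions; the real agents read and write through the Blackboard.
async function runAuditLoop({ plan, coder, auditor, maxIterations = 3 }) {
  let feedback = null; // instructions from the previous failed review
  let output = null;

  for (let i = 1; i <= maxIterations; i++) {
    output = await coder(plan, feedback);
    const review = await auditor(plan, output);

    if (review.issues.length === 0) {
      return { verdict: 'pass', iterations: i, output };
    }
    feedback = review.issues; // send the Coder back with specifics
  }
  // Iteration budget spent without a clean review.
  return { verdict: 'fail', iterations: maxIterations, output };
}
```

Capping the loop keeps a stubborn disagreement between Coder and Auditor from burning tokens indefinitely.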

Five engineering decisions worth reading

1. Per-role model allocation

Every agent gets the model that fits its job, not a one-size-fits-all default. Researcher needs a 128K window to ingest a full codebase in one shot. Perceptor uses a fast 8K model because triage is cheap. Coder and Auditor use kimi-k2.6 because code generation and verification need the highest-fidelity reasoning available.

// From BaseAgent.js - actual production allocation
// Researcher: moonshot-v1-128k  (128K window for repo-wide ingestion)
// Perceptor:  moonshot-v1-8k    (fast triage and priority scoring)
// Architect:  moonshot-v1-8k    (rapid planning and critique)
// Coder:      kimi-k2.6         (high-fidelity agentic code generation)
// Auditor:    kimi-k2.6         (deep verification and DNA compliance)
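
One way to express that allocation in code is a plain lookup table. The sketch below is an assumption about shape only: ROLE_MODELS and modelForRole are hypothetical names, and the Documenter is left out because no model is listed for it above.

```javascript
// Hypothetical role -> model table mirroring the allocation comments.
// A Map avoids accidental hits on inherited object keys like "constructor".
const ROLE_MODELS = new Map([
  ['researcher', 'moonshot-v1-128k'],
  ['perceptor',  'moonshot-v1-8k'],
  ['architect',  'moonshot-v1-8k'],
  ['coder',      'kimi-k2.6'],
  ['auditor',    'kimi-k2.6'],
]);

// Fail loudly on an unknown role instead of silently using a default model.
function modelForRole(role) {
  const model = ROLE_MODELS.get(String(role).toLowerCase());
  if (!model) throw new Error(`No model allocated for role "${role}"`);
  return model;
}
```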

2. MoonshotLimiter: RPM + concurrency semaphore

Moonshot Tier 0 allows 20 RPM and concurrency 3. Naive clients exceed both constantly. MoonshotLimiter maintains a sliding 60-second RPM window and a concurrency semaphore. Requests queue cleanly instead of failing with 429s.

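class MoonshotLimiter {
  constructor() {
    this.active    = 0;   // requests currently in flight
    this.maxActive = 3;   // Tier 0 concurrency cap
    this.queue     = [];  // resolvers for callers waiting on a slot
    this.history   = [];  // start times inside the sliding 60 s window
  }
  // Resolves once a concurrency slot and RPM budget are both available.
  async acquire() {
    return new Promise(resolve => {
      this.queue.push(resolve);
      this._pump();
    });
  }
  async _pump() {
    if (this.queue.length === 0 || this.active >= this.maxActive) return;
    const now = Date.now();
    // Drop timestamps that have aged out of the window.
    this.history = this.history.filter(t => now - t < 60_000);
    if (this.history.length >= 20) {
      // RPM budget spent: retry just after the oldest request expires.
      const delay = (this.history[0] + 60_000) - now + 100;
      setTimeout(() => this._pump(), delay);
      return;
    }
    this.active++;
    this.history.push(Date.now());
    this.queue.shift()();
  }
  // Frees the slot and wakes the next queued caller, if any.
  release() { this.active--; this._pump(); }
}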
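
A limiter like this only helps if every acquire() is paired with a release(), including when the request throws. A small wrapper makes that hard to get wrong. The helper name withLimit is hypothetical, and it assumes only that the limiter exposes acquire() and release() as above.

```javascript
// Hypothetical helper: run `task` under any limiter that exposes
// acquire() and release(), such as the MoonshotLimiter above.
async function withLimit(limiter, task) {
  await limiter.acquire();
  try {
    return await task();
  } finally {
    // Always free the slot, even when the request fails,
    // otherwise the concurrency count leaks and the queue stalls.
    limiter.release();
  }
}
```

A call site would then look like `await withLimit(limiter, () => callMoonshot(prompt))`, where callMoonshot stands in for whatever function issues the request.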

3. Session-level circuit breaker

Once a provider exhausts every model in its chain during a run, it is skipped for the rest of the session, so no tokens are wasted retrying providers already known to be down. Manual operator overrides, exposed through a providerOverrides singleton, let you flip a provider out of rotation without restarting the process.
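
A minimal sketch of that behavior, assuming each provider is an object with a name, an ordered models list, and a call(model, prompt) function. Apart from providerOverrides, every name here (trippedThisSession, callWithFailover, the provider shape) is hypothetical and not taken from the engine.

```javascript
// Hypothetical sketch of session-level failover.
const providerOverrides = new Map();   // provider name -> false forces it out of rotation
const trippedThisSession = new Set();  // providers that exhausted every model this run

async function callWithFailover(providers, prompt) {
  for (const provider of providers) {
    if (providerOverrides.get(provider.name) === false) continue; // operator override
    if (trippedThisSession.has(provider.name)) continue;          // breaker already open

    for (const model of provider.models) {
      try {
        return await provider.call(model, prompt);
      } catch {
        // This model failed: fall through to the next one in the chain.
      }
    }
    // Every model failed, so skip this provider for the rest of the session.
    trippedThisSession.add(provider.name);
  }
  throw new Error('All providers exhausted');
}
```

Because the tripped set lives in process memory, a fresh run starts with every provider back in rotation.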

4. SHA-256 response cache

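Cache key is SHA-256(systemPrompt + '\x00' + prompt). A hit requires a byte-identical system prompt and prompt, so a cached response is never served for a different request. The null-byte separator keeps ('ab', 'c') and ('a', 'bc') from hashing to the same key. Saves real tokens when the Auditor re-reviews unchanged files or the Researcher issues the same query twice in one session.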

function _cacheKey(sys, prompt) {
  return crypto
    .createHash('sha256')
    .update(sys + '\x00' + prompt)
    .digest('hex');
}
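
To show where the key fits, here is a rough sketch of a cache wrapped around a model call. It assumes an in-memory Map scoped to the session; responseCache, cachedCompletion, and the callModel parameter are hypothetical names, and cacheKey repeats the function above so the block runs on its own.

```javascript
const crypto = require('node:crypto');

// Same construction as _cacheKey above, repeated so this sketch is self-contained.
function cacheKey(sys, prompt) {
  return crypto
    .createHash('sha256')
    .update(sys + '\x00' + prompt)
    .digest('hex');
}

// Hypothetical session-scoped cache: key -> model response.
const responseCache = new Map();

// `callModel` stands in for whatever function actually hits the provider.
async function cachedCompletion(sys, prompt, callModel) {
  const key = cacheKey(sys, prompt);
  if (responseCache.has(key)) return responseCache.get(key); // no tokens spent
  const response = await callModel(sys, prompt);
  responseCache.set(key, response);
  return response;
}
```

Failed calls are never stored, because the set() only runs after callModel resolves.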

5. Durable Blackboard (SQLite)

All agent state lives in Blackboard.js backed by SQLite with full ACID guarantees. If the process crashes mid-pipeline, state survives. Agents never pass context through function arguments or in-memory globals — everything flows through Blackboard methods. This enforces a clean separation between agents and makes the full run observable and replayable.

Two modes

# Self-mode: operates on its own repo
node index.js --goal "add a health check endpoint"

# External mode: operates on any target project
node index.js --project /path/to/target --goal "fix the failing tests"

The same pipeline that improves AI-SDLC itself can be pointed at any other Node.js repo.

Quick start

# Repo private during beta — request access from Admin (adiyogibooks@gmail.com)
cd ai-sdlc
npm install
cp .env.example .env   # add at least one LLM API key
npm test               # 137 tests, no external services needed
node index.js --goal "add a README badge"

Ollama is the local final-fallback — no API key required. If you have Ollama running at localhost:11434, the pipeline keeps working even with no cloud keys set.

AI-Powered Dev Platform

The AI-Powered SDLC Engine is self-hosted. You clone it, add API keys, and run it on your own machine or server. That works for a solo developer. It does not work for a hosted product where strangers submit repositories and expect isolated, safe execution.

The AI-Powered Dev Platform is the hosted product wrapper. Node.js, Express, Postgres, Redis, E2B sandboxing, JWT auth, billing. It extends the SDLC Engine directly — same language, same codebase, no rebuild. Here is the architectural reasoning behind each choice:

E2B sandboxing is mandatory, not optional. FileGuardian and git isolation protect the server from the engine's own actions. They cannot protect you when a stranger submits code that intentionally tries to escape scope. E2B gives each job a disposable VM that cannot reach shared infrastructure. This was the key lesson from 5 years of watching what happens when execution escapes its intended scope at 2am.
Postgres replaces SQLite at scale. SQLite allows only one writer at a time, so concurrent writes queue behind a single lock. For a multi-tenant product running dozens of jobs simultaneously, Postgres with proper connection pooling is the correct store.
Redis queue for job isolation. Each submitted job gets a dedicated SSE queue. Results stream back to the submitting client without leaking state across sessions.
Agent rename is a UX decision. The SDLC Engine agents are named for what they do internally (Researcher, Perceptor, Architect). The Dev Platform agents are named for what they look like to a customer (Planner, Implementer, Reviewer). Same pipeline logic, different interface contract.

"The AI-Powered SDLC Engine is the hard part and it is done. The hosting wrapper is standard SaaS plumbing. The engineering challenge was the engine. The Dev Platform is a product decision."

Product Engine Architecture

The hosted product takes a feature specification and resolves it through a Directed Acyclic Graph of five product-facing agents. The DAG is not a fixed linear chain: the scheduler calculates execution order at runtime, fires independent agents concurrently, and enforces gates so no agent starts before its dependencies complete. Test Author and Doc Writer fire simultaneously after Reviewer completes because they have no dependency on each other.
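
The scheduling idea can be sketched with Kahn's algorithm, grouping agents into waves whose members have no dependencies on each other. This is an illustration under stated assumptions, not the platform's scheduler: planWaves and runDag are hypothetical names, the Planner to Implementer to Reviewer edges in the usage note are assumed, and running wave by wave is slightly stricter than gating each agent on its own dependencies.

```javascript
// deps maps each node to its prerequisites. Every node, including those
// with no prerequisites, must appear as a key.
function planWaves(deps) {
  const indegree = new Map();   // node -> number of unmet prerequisites
  const dependents = new Map(); // node -> nodes waiting on it
  for (const [node, prereqs] of Object.entries(deps)) {
    indegree.set(node, prereqs.length);
    for (const p of prereqs) {
      if (!dependents.has(p)) dependents.set(p, []);
      dependents.get(p).push(node);
    }
  }

  const waves = [];
  let scheduled = 0;
  let ready = [...indegree].filter(([, d]) => d === 0).map(([n]) => n);
  while (ready.length > 0) {
    waves.push(ready);
    scheduled += ready.length;
    const next = [];
    for (const node of ready) {
      for (const child of dependents.get(node) ?? []) {
        indegree.set(child, indegree.get(child) - 1);
        if (indegree.get(child) === 0) next.push(child);
      }
    }
    ready = next;
  }
  // Nodes left unscheduled mean a cycle (or a prerequisite missing from deps).
  if (scheduled !== indegree.size) throw new Error('Dependency cycle detected');
  return waves;
}

// Run each wave concurrently; the await is the gate between waves.
async function runDag(deps, runAgent) {
  for (const wave of planWaves(deps)) {
    await Promise.all(wave.map(node => runAgent(node)));
  }
}
```

With deps of `{ Planner: [], Implementer: ['Planner'], Reviewer: ['Implementer'], 'Test Author': ['Reviewer'], 'Doc Writer': ['Reviewer'] }`, the last wave contains both Test Author and Doc Writer, so they start together once Reviewer finishes.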
